Introduction


Neighborhoods have enduring impacts on the people who live in them. They affect a wide range
of social and personal experiences of their residents, from schooling, to access to health care
and quality food, to social relationships and supports, and to civic participation. Therefore,
understanding the characteristics of neighborhoods is important for informing policy decisions
about how to most equitably and efficiently allocate services, supports, and resources. But
determining how to characterize neighborhoods is not a straightforward task; they are
multifaceted, complex units made up of disparate factors such as people, physical environment,
cultural norms, institutions, and businesses, just to name a few.


The process and results of defining neighborhoods
may look different depending on the purpose. In
this brief, we focus on how we, a team of education
researchers, defined neighborhoods in Chicago. The
initial goal of this work was to enable us to answer
questions about how access to and enrollment in
school-based pre-kindergarten (pre-k) may have
varied by neighborhood characteristics. In conducting
this work, we found the question of how to describe
neighborhoods particularly interesting.

We used a data-driven method for characterizing
neighborhoods that leveraged publicly available
census data and allowed us to consider many neighborhood
characteristics simultaneously. This method
resulted in a parsimonious set of five neighborhood
groupings in Chicago that enabled us to simplify how
we understood the relationship of neighborhood characteristics
to a host of educational and other outcomes.
In fact, our school district colleagues recommended
that we produce this brief as a resource to those seeking
to apply similar approaches in their work.

This brief shares our approach to defining neighborhood
groupings and their characteristics and our findings
about them. First, we review prior neighborhood
research and ways of characterizing Chicago community
areas and neighborhoods, outlining how the
current work builds on these previous efforts. We then
present our data-driven method for grouping neighborhoods
and describe our neighborhood groupings.
Finally, we discuss the potential uses and implications
of our work. We hope that others in Chicago (or elsewhere)
find this approach to defining neighborhood
groupings useful, both for conducting research studies
and providing services or resources.
The Importance of Neighborhoods
in Prior Research


Many researchers from disparate fields such as 
education, sociology, and psychology have studied 
the relationship between neighborhood character- 
istics and children’s outcomes. For example, much 
research has found a direct link between neighbor- 
hood poverty and poorer academic, behavioral, and 
health outcomes.[1] Other work has detailed the diverse
mechanisms — including social relationships, norms, 
and institutional resources such as preschool quality — 
through which neighborhood characteristics can 
positively shape outcomes as well.[2] The resounding
conclusion from these decades of research is that
where we live, what is located near us, and who our
neighbors are matter for many aspects of our lives.


Seminal Work on Defining 
Neighborhoods in Chicago 


A classic study conducted by Robert Sampson (and 
many colleagues), called the Project on Human 
Development in Chicago Neighborhoods, identified 
many relationships between neighborhood character- 
istics and the individual and collective experiences 
and outcomes of their residents. The book that 
resulted from this ambitious study, Great American
City: Chicago and the Enduring Neighborhood
Effect,[3] provides scientific evidence of the impact
of neighborhoods.

Sampson does not provide a single definition of the 
term “neighborhood,” but he does point out important 
themes that run through many historical definitions. 
One theme is location — neighborhoods are geographic 
units that are embedded in larger units such as cities. 
The ability for residents to interact in person may also
be a feature of a neighborhood, meaning they are
usually small enough units to allow residents to do
so. A second is identity and connections: neighborhoods are
characterized by a sense of social identity that is often
defined by factors beyond location, such as race, eth- 
nicity, and social class, or by the combinations of these 
factors. Neighborhoods may also gain an identity from 
what they are not — not poor, for example. 

In Chicago, there are 77 official community areas,
which have been in continual use since the early 1920s
when they were first created.[4] However, although
these community areas connect geographically to 
specific locations in the city and have defined bound- 
aries, they are geographically too big to be considered 
neighborhoods in Sampson's terms. In this sense, 
Chicago’s 77 community areas do not define the city’s 
neighborhoods because there is much diversity within 
many community areas and great variability from one 
neighborhood to another within a given community 
area. This suggests that to better understand their 
characteristics, neighborhoods in Chicago should be 
defined on a smaller scale. 

In contrast, Sampson and his research colleagues 
identified 343 neighborhoods in Chicago by combin- 
ing two to three adjacent census tracts (out of a total 
of 866) from the 2000 U.S. Census. Census tracts 
are small (typically they contain just 1,200 to 8,000 
residents), relatively homogeneous geographic units 
used by the Census Bureau to present statistical data.[5]
These census tracts were not combined arbitrarily; the 
researchers considered boundaries (highways, rivers, 
railroad tracks) and demographic characteristics of 
the nearby census tracts to ensure that they combined 
relatively similar tracts. Sampson's approach helped
to identify Chicago’s individual neighborhoods with
much greater specificity than is captured by the official
community areas. Yet, this greater specificity and
accuracy also created a new problem: 343 neighborhoods
are harder to understand, analyze, and create
policy for than are 77 community areas.
How the Current Work Builds
On Prior Examinations of
Neighborhoods


To account for this complication, our approach to 
grouping neighborhoods differs from Sampson’s in 
a fundamental way. Rather than grouping census
tracts together based on geographic proximity, we
are defining “neighborhood groupings” based on
shared sociodemographic characteristics, emphasizing
Sampson's second definitional theme (identity
and connections) over location. That is, our groupings
include census tracts where people share many similarities
in terms of race/ethnicity, income, and other
characteristics, but they may have scattered locations.
To do so, we used a data-driven approach (detailed in
Appendix A) to group the 797 census tracts in Chicago
defined by the 2010 U.S. Census that had residents
with similar characteristics.

We took an analytic approach that is referred to 
as a “person-centered” approach (or “neighborhood- 
centered” in this case) as opposed to the more typical 
“variable-centered” approach used by most prior studies 
of neighborhoods. In the latter case, neighborhood 
characteristics like those listed above are explored 
independently as predictors of relevant outcomes. For 
example, we might explore whether the concentration 
of Hispanic or Black residents in children’s home neigh- 
borhoods is predictive of their enrollment in pre-k, hold- 
ing constant levels of neighborhood poverty, language 
status, etc. While helpful in certain circumstances, 
trying to “disentangle” often-related neighborhood 
characteristics leaves us with results that are difficult to 
interpret. With our neighborhood-centered approach, 
we can make fewer and more intuitive comparisons 
among neighborhoods by considering a multitude of 
neighborhood characteristics simultaneously to under- 
stand how these variables, viewed in combination, 
relate to outcomes of interest. For example, using a
neighborhood-centered analysis, we can compare pre-k
enrollment patterns across a small number of different
kinds of neighborhood groupings.


Although neighborhood-centered approaches are 
not very common in education research, we are not 
the first to attempt this work. For example, researchers 
have used this method for grouping neighborhoods 
to look at how neighborhoods are related to health 
outcomes.[6] There is also a small number of examples
from psychology.[7] For example, one study used
census data from 1990 to categorize neighborhoods
based on several dimensions, including violence,
disadvantage, and collective efficacy, to explore
how neighborhood groupings were associated with
adolescent antisocial behavior.[8] Like ours, these studies
all drew from large datasets (such as the census)
to characterize neighborhoods based on factors other
than location. Unlike our work, however, these studies
tended to focus on older children and did not
consider educational outcomes (with one exception[9]).
Importantly, most prior work has grouped neighbor- 
hoods based on measures of structural or relational 
(dis)advantage (e.g., housing problems, green space, 
neighborhood disorder, violence, etc.). Instead, we are 
seeking simply to describe neighborhoods in terms 
of the people who live in them. 

We conducted this investigation of how to catego- 
rize neighborhoods into meaningful groupings in 
order to facilitate a research study that examined the 
relationship between access to Chicago Public Schools 
pre-k classrooms and students’ actual enrollment.[10]
Our research questions asked not only who enrolled
in pre-k but also how geographic location and,
more importantly, the neighborhood context
of children’s residences were related to pre-k access
and enrollment among different student groups. Thus,
we were focused on several descriptors of residential 
neighborhoods that are important for the district to 
understand when making decisions about how best 
to support students’ enrollment and success in school. 
These included typical indicators such as race/ethnicity
and income and employment of residents,[11] as well
as the prevalence of bilingual speakers and other linguistic
characteristics of residents in neighborhoods.
In addition, given previous research
in Chicago,[A] we considered the education level and
occupation of residents[B] within neighborhoods. This
enabled us to examine access and enrollment patterns
within and across different groupings of neighborhoods
composed of people like each other on these
characteristics. Understanding neighborhoods based
on individual resident characteristics (as opposed to
structural or environmental characteristics) is useful in
two ways. First, it helps to characterize the contexts
in which students live and thus the ways that their
residential neighborhood might impact their outcomes
and experiences in school (as we describe
above). And second, it can serve as a “shorthand”
method for identifying locations in which one is likely
to find high concentrations of students who would
benefit from similar kinds of supports and services.


Method 


The Census Variables Used to 
Group Census Tracts Into 
Neighborhood Groupings 


We used 12 different variables at the tract level from
the 2012 American Community Survey 5-year
estimates to conduct the analysis[C]: four measure
race/ethnicity; four measure language and place of
birth; two measure income level and employment
(combined into one variable); and two measure education
and occupation (combined into one variable).
The technical names and table location within the
American Community Survey files of each variable
are contained in Table B.1 in Appendix B. The actual
analysis used only ten variables, as the income and
employment variables were combined into one, as
were the education and occupation variables.


RACE/ETHNICITY

• Percent Asian
• Percent Black
• Percent Hispanic (non-White)
• Percent White


LANGUAGE AND PLACE OF BIRTH

• Percent Who Speak English Well
• Percent Bilingual
• Percent Who Speak Only Another Language (not English)
• Percent Foreign-Born


INCOME AND EMPLOYMENT
(Combined Into One Variable[D])

• Percent of Families with Income Above the Poverty Level
• Percent of Employed Males


EDUCATION AND OCCUPATION
(Combined Into One Variable)

• Mean Level of Education in Years (over 25 years old)
• Percent Employed as Management, Professionals

A We used two variables created at the University of Chicago
Consortium on School Research: Income and Employment
(combined into one variable) and Education and Occupation
(combined into one variable). Each of these is composed of
the two census items noted in the text. Documentation for
the variables can be found in Bryk, Sebring, Allensworth,
Luppescu, & Easton (2010). They have been used regularly
and successfully since then.

B Brooks-Gunn, J., Duncan, G.J., Leventhal, T., & Aber, J.L. (1997).
Lessons learned and future directions for research on the neighborhoods
in which children live. Brooks-Gunn, J., Duncan, G.J.,
and Aber, J.L. (eds.) Neighborhood Poverty, 1, 279-297.

C The American Community Survey is a product of the US Census
Bureau that provides vital information on a yearly basis about
the nation and its people through a nationally, state representative
survey. Neighborhood type results from the 2012 five-year
estimates were verified, and confirmed, by re-running analyses
with the 2015 five-year estimates. 2012 was chosen because it
was the year that made the most sense for the time period of
interest of the Pre-K study.

D The income and employment variables are combined and
negatively coded to create one variable.
Why We Chose These Specific
Census Variables


We chose variables most aligned with the intent of 
expanding pre-k access for students most likely to 
benefit from pre-k enrollment. These included “high 
priority” students — students of color, those speaking 
a language other than English, and those living in neigh- 
borhoods with lower income and higher unemploy- 
ment. Several policies prioritized these students and 
the neighborhoods in which they lived; we therefore 
wanted to identify neighborhoods by using indicators
similar to those the policies attended to. We also included
a measure of education level and employment that has
proven to be related to positive student outcomes in
prior UChicago Consortium research.[12]


Analytic Technique Used to 
Group Census Tracts 


Our groupings of census tracts were defined by using 
an advanced statistical technique[E] that identifies
neighborhoods that are similar in terms of the 10 census 
variables listed above and captures complexities of 
neighborhoods better than other statistical approaches. 
This is desirable from a research and analytic perspective 
given that it results in fewer variables to consider in our 
analyses. These new groupings are also easier to under- 
stand and describe. The statistical technique is a special 
case of a family of methods called “mixture models” 
(see Appendix A for statistical detail and software code). 
There are two main goals of this type of analysis: 
(1) Create groups in which the within-group similarities 
in census characteristics are maximized, while at the 
same time, (2) Maximize the between-group differences. 
That is, each grouping of census tracts will contain the
tracts that are most like one another and exclude the
ones least like them.
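
As a rough illustration of the two goals described above, the sketch below fits a small Gaussian mixture by expectation-maximization in plain Python. It is not the software code referenced in Appendix A: the function name `fit_profiles`, the toy `tracts` data, the choice of one shared variance per variable, and the number of random starts are all assumptions made for illustration only.

```python
import math
import random


def fit_profiles(rows, k, n_starts=20, n_iter=100, seed=0):
    """Fit k latent profiles by EM; return (log_likelihood, means, posteriors).

    Assumption: each profile has its own mean per variable, while each
    variable has one variance shared by all profiles.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    col_mean = [sum(r[j] for r in rows) / n for j in range(d)]
    start_var = [max(sum((r[j] - col_mean[j]) ** 2 for r in rows) / n, 1e-6)
                 for j in range(d)]

    def e_step(weights, means, var):
        # Posterior probability of each profile for each row, plus log-likelihood.
        posts, loglik = [], 0.0
        for r in rows:
            logp = []
            for c in range(k):
                lp = math.log(max(weights[c], 1e-300))
                for j in range(d):
                    lp -= 0.5 * (math.log(2 * math.pi * var[j])
                                 + (r[j] - means[c][j]) ** 2 / var[j])
                logp.append(lp)
            top = max(logp)
            log_total = top + math.log(sum(math.exp(v - top) for v in logp))
            loglik += log_total
            posts.append([math.exp(v - log_total) for v in logp])
        return posts, loglik

    best = None
    for _start in range(n_starts):
        means = [list(r) for r in rng.sample(rows, k)]  # random starting means
        weights, var = [1.0 / k] * k, list(start_var)
        for _step in range(n_iter):
            posts, _ = e_step(weights, means, var)
            size = [max(sum(p[c] for p in posts), 1e-12) for c in range(k)]
            weights = [s / n for s in size]
            means = [[sum(p[c] * r[j] for p, r in zip(posts, rows)) / size[c]
                      for j in range(d)] for c in range(k)]
            var = [max(sum(p[c] * (r[j] - means[c][j]) ** 2
                           for p, r in zip(posts, rows) for c in range(k)) / n,
                       1e-6) for j in range(d)]
        posts, loglik = e_step(weights, means, var)
        if best is None or loglik > best[0]:  # keep the best of the random starts
            best = (loglik, means, posts)
    return best


# Toy standardized values for ten hypothetical tracts on two variables.
tracts = [[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1], [0.1, 0.2], [-0.2, -0.2],
          [5.0, 5.0], [5.2, 4.9], [4.9, 5.1], [5.1, 5.2], [4.8, 4.8]]
loglik, means, posteriors = fit_profiles(tracts, k=2)
labels = [p.index(max(p)) for p in posteriors]  # most likely profile per tract
```

In this toy case the first five tracts end up in one profile and the last five in the other, regardless of where they would sit on a map, which mirrors how the groupings in this brief can be geographically scattered.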


Results 


Our work resulted in the identification of five distinct 
groupings of census tracts in Chicago, some of which 
are scattered across the city with others more geo- 
graphically concentrated. The five groupings are rela- 
tively easy to describe, and they are easily understood 
by those familiar with Chicago. We found that using 
the five neighborhood groupings added considerable 
value to our research project and made our findings 
easier to interpret than if we had used the “variable- 
centered” approach. 

Note that Chicago’s longstanding residential
segregation patterns are reflected in these neighborhood
groupings. For a comprehensive examination,
we highly recommend the following report from the
Institute for Research on Race and Public Policy at the
University of Illinois Chicago: A Tale of Three Cities: The
State of Racial Justice Report in Chicago.[13] Here is an
extensive quotation from the summary of that report:
“The central finding of this report is that racial and
economic inequities in Chicago remain pervasive,
persistent, and consequential. These inequities affect
the lives of Chicagoans in every neighborhood; they
have not just spatial but also deep historical roots
and are embedded in our social, economic, political
institutions; and they have powerful effects on
the experiences and opportunities of all Chicagoans.
The patterns..... are stark, if not entirely surprising.
Chicagoans of all racial and ethnic groups want to live
in safe and healthy communities where they don’t just
subsist or survive, but not all have equal access.”


E The technical name for this analytic method is latent profile
analysis; see Appendix A for details.
That report provides a deep look into the context
of Chicago neighborhoods, and it also reflects one of
our motivations for undertaking this effort to describe
those neighborhoods in terms of who the residents
are. We acknowledge that our groupings, as described
below, reflect these long-standing inequities. Highlighting
the sociodemographic characteristics of the
residents in these neighborhoods is intended to
provide useful information that can inform policy
and supports to all communities across the city.


Five Chicago Neighborhood 
Groupings Descriptions 


The five neighborhood groupings are as follows (listed
in order of highest percentage of census tracts included
in each). We deliberately chose not to give them
descriptive names, given that the intent of this brief is
to highlight the technique and purpose for grouping
similar neighborhoods in Chicago, rather than naming
them. We leave it to readers to choose the most
appropriate names for their specific purposes.


• Group 1 (31% of census tracts[F]). This neighborhood
group contains tracts almost entirely comprised of
Black residents; nearly all residents are native-born
and speak English well. These neighborhoods have
the lowest proportion of households with incomes
above the federal poverty level, employed males,
and individuals in managerial jobs.


• Group 2 (23% of census tracts). This neighborhood
group contains tracts with high percentages of White
residents, relatively few Black and Hispanic residents,
and some Asian residents. Most residents in these
tracts speak English well, few are bilingual or other
language speakers, and few are foreign-born. These
neighborhoods have the highest proportion of households
with incomes above the federal poverty level,
employed males, individuals in managerial jobs, and
highest average years of education.


• Group 3 (19% of census tracts). This neighborhood
group contains tracts with high percentages of Hispanic
residents and relatively few Black, White, and Asian
residents. About two-thirds of residents in these tracts
speak English well (a small proportion relative to the
city of Chicago), and more than one-third of residents
are bilingual, speak a language other than English,
and/or are foreign-born. In these neighborhoods, the
proportion of households with incomes above the
federal poverty level is similar to Chicago’s city-wide
average. These tracts have higher than average male
employment rates, but residents are less likely to hold
jobs in managerial roles, and have the lowest average
years of education.


• Group 4 (18% of census tracts). This neighborhood
group contains tracts in which almost half of residents
are White, and another third are Hispanic. These tracts
also have the largest proportion of Asian residents
relative to tracts in other neighborhood groups and
Chicago’s city-wide average. About three-quarters of
residents in these tracts (a small proportion relative to
the city of Chicago) speak English well, and many are
bilingual, speak a language other than English, and are
foreign-born. In these neighborhoods, the proportions
of households with incomes above the federal poverty
level and employed males are higher than Chicago’s
city-wide average. These tracts are similar to the
city-wide average in terms of years of education and
percentage of individuals in managerial jobs.


• Group 5 (9% of census tracts). This neighborhood
group contains tracts that are most racially diverse;
they are half Black, one-fifth White, and one-fifth
Hispanic, on average. Tracts in this group are similar
to the Chicago-wide average on all variables, although
slightly fewer residents than average are bilingual or
speak a language other than English. These neighborhoods
have a slightly smaller proportion of households
with incomes above the federal poverty level, and
more residents employed in managerial jobs.


F This is the percent of census tracts, not the percent of the
population, because the size of census tracts varies greatly.
Details About the Five Neighborhood Groupings


Table 1 provides a breakdown of census variables by neighborhood grouping and the average
for census tracts within Chicago.


TABLE 1 


Average neighborhood group population characteristics (c. 2012). 


Variable                                                   Group 1  Group 2  Group 3  Group 4      Group 5      City of Chicago
                                                           (31%)    (23%)    (19%)    (18%)        (9%)         Census Tract Average
Black                                                      94.4%    5.5%     5.5%     6.7%         52.0%        37.8%
White                                                      2.1%     75.3%    13.5%    47.4%        20.4%        30.8%
Hispanic                                                   2.6%     10.6%    78.5%    30.8%        20.7%        25.3%
Asian                                                      0.4%     6.9%     2.4%     13.4%        5.8%         5.05%
Speak English Well                                         98.6%    93.7%    60.1%    76.5%        88.8%        85.4%
Bilingual                                                  2.9%     14.1%    37.5%    31.6%        16.9%        19.1%
Speak Only Another Language (not English)                  1.7%     6.4%     39.5%    19.3%        10.9%        14.4%
Foreign-Born                                               2.2%     14.1%    40.2%    32.7%        17.0%        18.8%
Income (Families with Income Above the Poverty Level)      69.3%    95.0%    77.7%    86.3%        76.0%        80.4%
Employment (Employed Males)                                53.2%    88.2%    79.2%    81.1%        71.9%        72.8%
Education (Average Level of Education in Years)            12.8     15.6     10.9     [illegible]  [illegible]  [illegible]
Occupation (Employed as Management)                        25.6%    57.8%    15.9%    35.3%        37.2%        34.0%
Figure 1 below provides values, in standard deviation
units, for the 10 different census variables across
the five neighborhood groupings (recall that Income
and Employment were combined into one variable
as were Education and Occupation). In this figure,
each neighborhood grouping has ten bars, one for
each of the 10 census variables used in the analysis.
Bars above the zero line have higher values than the
Chicago average for that variable, and correspondingly,
bars below the zero line have lower values than
the Chicago average for that variable. The graph uses
“standard deviation units” where zero is equal to the
average value, and +1.00 SD units is roughly equivalent
to the 84th percentile ranking, and -1.00 SD units is
roughly equivalent to the 16th percentile ranking.
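
The link between standard deviation units and percentile rankings can be checked with a few lines of Python. This is only a sketch: the helper `to_sd_units` and the `hypothetical_pct` values are invented for illustration, the use of the population standard deviation is an assumption, and the percentile equivalence holds only to the extent that a variable is roughly normally distributed across tracts.

```python
from statistics import NormalDist, mean, pstdev


def to_sd_units(values):
    """Express each value as its distance from the mean in SD units."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical tract-level percentages, not taken from the brief.
hypothetical_pct = [10.0, 20.0, 30.0, 40.0, 50.0]
sd_units = to_sd_units(hypothetical_pct)

# Percentile rank implied by +1 SD and -1 SD under a normal distribution.
pct_above = NormalDist().cdf(1.0) * 100
pct_below = NormalDist().cdf(-1.0) * 100
print(round(pct_above), round(pct_below))
```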

To provide an example for understanding Figure 1,
we “talk through” one neighborhood grouping,
Group 3. This is the third most prevalent neighborhood
grouping in Chicago and comprises 19% of census
tracts. It is also the neighborhood grouping with the
most extreme values on six of the ten variables. These
six (the highest and lowest bars) are Percent Hispanic
(1.81 SD above the city average), Speak Other Language
(1.63 SD above average), Speak English Well (1.62 SD
below average), Foreign-Born (1.31 SD above average),
Bilingual (1.22 SD above average), and Education and
Occupation (1.17 SD below average). The three other
race/ethnicities (Black, White, and Asian) are also
below the city average (at 0.79 SD, 0.56 SD, and 0.30
SD below, respectively). Only one combined variable,
Income and Employment, is almost the same as the
city average with a value of 0.05 SD units. See Table D.1
in Appendix D for precise standard deviation values
for all neighborhood groupings.

The information contained within Table 1 and
Figure 1 informed the narrative descriptions of the
five neighborhood groupings above.


FIGURE 1

Characteristics of the five neighborhood groupings relative to Chicago averages.
Our Neighborhood Groupings
Mapped Onto the 77 Chicago
Community Areas


The geographic locations of the five neighborhood
groupings are shown in the Chicago map below, which
also shows the boundaries of its 77 community areas.
Those who know Chicago will recognize in Figure 2
that large swaths of the south and west sides are
Group 1 neighborhoods; that many north-side, lakefront
neighborhoods are Group 2 neighborhoods;
and that Group 3 neighborhoods are found near
the northwest and southwest areas of the city. Also,
there are many Group 4 and Group 5 neighborhoods
spread across the city, especially in between more
homogeneous areas of the city.


FIGURE 2

Map of Chicago with the five neighborhood groupings mapped onto the 77 community areas.
Conclusion


This “neighborhood-centered,” rather than “variable-centered,” approach to the analysis enabled
us to understand how neighborhoods in Chicago can be thought of in terms of the people who
live in them. Our work provides just one example of how these methods can be useful to cities,
school districts, and policy makers across the country. Future work can apply a similar approach
to meet different aims.


Using this method for creating neighborhood group- 
ings has aided our own research study in several ways. 
First, we can look at the city in a more fine-grained
way by combining similar census tracts rather than
the more typical use of Chicago's 77 community areas,
allowing us to see variation within community areas
more clearly. We can also examine outcomes (e.g.,
pre-k access and enrollment) more easily by neighborhood
groupings than we could have with a more
conventional approach that would have been more
complicated and difficult to interpret. This is because
we are less interested in how the 12 census variables
independently relate to child outcomes and are more
interested in how the combined variables define
neighborhood groupings and how the neighborhood
groupings influence outcomes.

But ours is just one application of this approach. 
The aims of future work using similar methods will 
differ. For example, in this analysis we used data from
the 2012 American Community Survey and replicated
the analysis using 2015 data and found nearly identical
results. This analysis could easily be re-run with the
most recent census data from the 2018 American
Community Survey to track how neighborhoods shift
over time. It is also important to note that a similar 
analysis could be conducted using different census 
variables or by using other geo-coded data from differ- 
ent sources. Because we aimed to support the school 
district in understanding and improving students’ 
access and enrollment in pre-k, our work focused on 
describing neighborhoods in terms of the people who 
live in them. However, one could easily envision creat- 
ing neighborhood groupings for other purposes that 
instead describe the neighborhood's physical environ- 
ment or resources, such as the presence of park land, 
playgrounds, libraries, museums, and other cultural 
institutions. Work focused on public health might 
include the presence of grocery stores and community 
gardens, or air quality indicators. 

The opportunities for describing neighborhoods are
great and should be tailored to the specific decisions
to be made and research applications of users.
Appendix A


Modeling 


To conduct the Latent Profile Analysis (LPA), our pre-k
access and enrollment study team utilized MPlus[G]
to run a mixture model with a maximum likelihood
estimator.[H] LPA is a special case of mixture modeling
in which the latent profiles (or what we call “neighborhood
groupings” above) explain the relationships
among the observed continuous dependent variables
(neighborhood variables about characteristics of
residents) through a set of linear regression equations.
The key modeling choices one has with an LPA
are which variables to include and how many different
profiles the data should be grouped into. We ran
models investigating what the data looked like with
2-8 profiles, increasing the number of random starts as
necessary to ensure that the best log-likelihood value was
replicated and that we had not reached a local maximum.[I]
Additionally, we ran two sets of models: one using
variables pulled directly from ACS, including a variable
for poverty level (% of those living below 150%
of the federal poverty level), and another using versions
of the ACS variables standardized to the census
tracts in the City of Chicago, with two variables often
calculated by the UChicago Consortium (Income
and Employment combined into one variable and
Education and Occupation combined into one
variable) in place of the poverty level variable.


Fit. In LPA, and structural equation models (SEMs) 
more broadly, there is quite a bit of subjectivity in 
choosing the model with the best fit. Below are the 
different indicators of model fit we used to help 
decide on which model (how many profiles) to move 
forward with. While some of the indicators provided 
justification for selecting 6 or 7 profiles, we ended 
up deciding on 5 profiles as this most closely aligned 
with our knowledge of Chicago, performed well on 
the fit indicators, all while maintaining a substantive 
percentage of tracts per group. From the classification
probabilities shown in Table A.1, we can see that the
5-profile model does a particularly good job of defining
profiles that are very different from one another
(top to bottom diagonal).


• Log-likelihood (LL) value: Higher values (closer to 0)
indicate better fit


• Lo-Mendell-Rubin (LMR) likelihood ratio test: Used
to compare models with different numbers of
clusters; significant p-value indicates that the more
complex model (with more clusters) fits ‘better’


• Adjusted BIC: Smaller values indicate better fit; can
compare non-nested models, but gives no p-value


• Entropy: Used to represent how well the posterior
probabilities were collectively able to confidently
classify individuals; higher entropy indicates greater
confidence
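
Two of these indicators can be computed by hand, which may help readers interpret them. The sketch below uses the common sample-size-adjusted BIC formula, -2LL + p * ln((n + 2) / 24), and the usual relative entropy statistic. The parameter count in `n_free_parameters` is an assumption (profile-specific means, one shared variance per variable, and k - 1 class proportions), as is the use of n = 797 tracts; under those assumptions the function matches the Adj BIC values reported in Table A.1 to within rounding. The function names are hypothetical and are not part of any software used in the study.

```python
import math

N_TRACTS = 797  # number of Chicago census tracts analyzed, per the main text


def n_free_parameters(k, n_vars=10):
    """Assumed count: k * n_vars means, n_vars shared variances, k - 1 proportions."""
    return k * n_vars + n_vars + (k - 1)


def adjusted_bic(loglik, k, n=N_TRACTS):
    """Sample-size-adjusted BIC; smaller values indicate better fit."""
    return -2.0 * loglik + n_free_parameters(k) * math.log((n + 2) / 24.0)


def relative_entropy(posteriors):
    """1.0 means every unit is classified with certainty; 0.0 means pure guessing."""
    n, k = len(posteriors), len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))


# Log-likelihood of the 5-profile model as reported in Table A.1.
print(round(adjusted_bic(-5812.806, k=5), 2))
```

Note that the log-likelihood always improves as profiles are added, which is why the penalized Adjusted BIC and substantive judgment are used alongside it.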


G TheMPLUS code use for this analysis can be found in Appendix A. 


H “Mixture modeling refers to modeling with categorical latent
variables that represent subpopulations where population
membership is not known but is inferred from the data. With
continuous latent class indicators, the means of the latent class
indicators vary across the classes as the default.” MPLUS User
Guide Chapter 7


I In MPLUS, for low numbers of classes we found that Starts= 200
50 (representing the number of initial stage starts and the number
of final stage optimizations) was sufficient to ensure a stable
solution. However, it was necessary to ramp up the number of
random starts to Starts= 2000 500 to ensure convergence for
models with the highest number of classes. The creators of
MPLUS recommend doubling the number of starts even after
convergence as a check to make sure a local maximum was not
reached, which we did.
TABLE A.1
LPA model fit statistics and classification probabilities.

Classes | LL Value  | LMR             | Adj BIC   | Entropy | % Sample per Class
2       | -9125.136 | 4806.328, p<.01 | 18358.937 | 0.972   | 7, 29
3       | -6937.886 | 4315.774, p<.01 | 14022.996 | 0.985   | [illegible]
4       | -6282.513 | 1293.149, p<.01 | 12750.808 | 0.975   | 25, 19, 21, 36
5       | -5812.806 | 926.803, p<.01  | 11849.953 | 0.982   | [illegible]
6       | -5424.681 | 765.830, p<.1   | 11112.26  | 0.985   | [illegible]
7       | -5071.515 | 696.849, p<.542 | 10444.487 | 0.974   | [illegible]
8       | -4806.424 | 523.064, p<.03  | 9952.864  | 0.981   | 4, 20, 32, 11, 14, 12, 6, 3


TABLE A.2
Classification probabilities for five cluster LPA model.

Most Likely Latent Class Membership

  | 1     | 2     | 3     | 4     | 5
1 | 0.993 | 0     | 0.007 | 0     | 0
2 | 0     | 0.98  | 0.003 | 0.004 | 0.012
3 | 0.009 | 0.002 | 0.977 | 0.012 | 0
4 | 0     | 0.002 | 0.013 | 0.984 | 0
5 | 0     | 0.003 | 0     | 0     | 0.997
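
One common way to build a table of this shape is to average each unit's posterior class probabilities within the class it is most likely to belong to. The sketch below shows that calculation on hypothetical data. The function name is an assumption, and the brief does not state whether its table is this average or the related classification-probability table that MPLUS also prints, so this is an illustration rather than the authors' procedure.

```python
def average_posteriors_by_assigned_class(posteriors):
    """Rows: most likely class; columns: mean posterior probability per class."""
    k = len(posteriors[0])
    sums = [[0.0] * k for _ in range(k)]
    counts = [0] * k
    for row in posteriors:
        # Modal assignment: the class with the largest posterior probability.
        assigned = max(range(k), key=lambda j: row[j])
        counts[assigned] += 1
        for j, p in enumerate(row):
            sums[assigned][j] += p
    # Classes with no assigned units get NaN instead of dividing by zero.
    return [
        [s / counts[i] if counts[i] else float("nan") for s in sums[i]]
        for i in range(k)
    ]


# Hypothetical posterior probabilities for three tracts and two profiles.
toy_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
avg_table = average_posteriors_by_assigned_class(toy_posteriors)
```

Diagonal values near 1, as in the five-profile table, mean that tracts assigned to a profile belong to it with high probability and to the other profiles with very low probability.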
2012 v 2015 Census Data


The same five neighborhood groupings were generated
when using the 2012 vs. 2015 census data. However,
roughly 12% of census tracts changed their group
membership between 2012 and 2015. For this study
on pre-k access and enrollment, we decided to
simply use the 2012 group classification, with
robustness checks made using the 2015 data. Even
though about 12% of census tracts changed
classification, the percentage of tracts in each
neighborhood type remained relatively stable, as can
be seen in Table A.2.

TABLE A.2
Census tract distributions using 2012 vs 2015 data.


        | Group 1 | Group 2     | Group 3     | Group 4     | Group 5     | Total
Group 1 | 234     | 2           | 2           | 0           | 13          | 251
Group 2 | 3       | 170         | [illegible] | [illegible] | [illegible] | 183
Group 3 | 1       | [illegible] | [illegible] | [illegible] | [illegible] | 149
Group 4 | 2       | [illegible] | [illegible] | [illegible] | [illegible] | 143
Group 5 | 7       | 6           | [illegible] | [illegible] | [illegible] | [illegible]
Total   | 247     | 187         | 156         | 142         | 65          | 797


MPLUS CODE (FOR ABOVE ANALYSIS)

TITLE:      Census latent profile analysis;

DATA:       FILE IS data/2015lpa_zscored.csv;

VARIABLE:   NAMES ARE Id2 pblack pwhite phisp
            pasian pengwell pbiling pothlang
            pfborn incemp edocc;

            ! Id2 is read but not used as a profile indicator
            USEVAR ARE pblack pwhite phisp pasian
            pengwell pbiling pothlang pfborn incemp
            edocc;

            MISSING ARE ALL (9999);

            ! One categorical latent variable with 5 profiles
            CLASSES = c(5);

ANALYSIS:   TYPE = MIXTURE;

            ESTIMATOR = MLR;

            ! Initial stage random starts, final stage optimizations
            STARTS = 600 150;

SAVEDATA:   FILE IS 2015/pprob_c5_zscored.dat;
            ! Save posterior class probabilities for each tract
            SAVE = CPROBABILITIES;
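
Returning to the 2012 vs. 2015 comparison summarized in Table A.2, the cross-tabulation and the share of tracts that changed group can be computed along these lines. This is a minimal sketch: the function names and the eight example tracts are hypothetical and are not the study's data.

```python
from collections import Counter


def transition_table(groups_a, groups_b):
    """Count tracts for each (group in first year, group in second year) pair."""
    return Counter(zip(groups_a, groups_b))


def share_changed(groups_a, groups_b):
    """Fraction of tracts whose group differs between the two classifications."""
    changed = sum(1 for a, b in zip(groups_a, groups_b) if a != b)
    return changed / len(groups_a)


# Hypothetical group assignments for the same eight tracts in two years.
groups_2012 = [1, 1, 2, 3, 5, 5, 4, 2]
groups_2015 = [1, 5, 2, 3, 5, 1, 4, 2]

table = transition_table(groups_2012, groups_2015)
changed = share_changed(groups_2012, groups_2015)
```

Tracts on the diagonal of the table kept their group. The overall group shares can stay stable even when individual tracts move, because moves in opposite directions offset each other, as in the toy data above.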


Appendix B 


Variables 


TABLE B.1
Variables used to create the five neighborhood groupings and corresponding ACS data file and definitions.

Variables Used for Five Neighborhood Groupings | ACS Data File (2012 5-year estimates) | Specific ACS Variable Definition or Calculation
Black | Table S0601: Selected characteristics of the total and native populations in the United States | % Black or African American, not Hispanic or Latino
White | Table S0601 | % White alone, not Hispanic or Latino
Hispanic | Table S0601 | % Hispanic (non-white)
Asian | Table S0601 | % Asian
Speak English Well | Table S1601: Language spoken at home | % Speak only English (Speak English Well)
Bilingual | Table S1601 | % Speak a language other than English + Speak English only or speak English “very well” (Bilingual)
Speak Only Another Language (not English) | Table S1601 | % Speak a language other than English + Speak English less than “very well” (Speak Other Language)
Foreign-Born | Table B05002: Place of birth by nativity and citizenship status | % Foreign-Born
Income (Families with Income Above the Poverty Level) | Table B17010: Poverty status in the past 12 months of families by family type by presence of related children under 18 years by age of related children | % of families with income in the past 12 months at or above poverty level
Employment (Employed Males) | Table B23022: Sex by work status in the past 12 months by usual hours worked per week in the past 12 months by weeks worked in the past 12 months for the population 16 to 64 years | % of Males Worked in the past 12 months
Education (Level of Education in Years) | Table B15002: Sex by educational attainment for the population 25 years and over | Mean level of Education (in years)
Occupation (Employed as Management) | Table C24010: Sex by occupation for the civilian employed population 16 years and over | % employed as management, business, science, and arts occupations
Appendix C


Additional Details About Results 


TABLE C.1
The precise values in standard deviation units used in Figure 1 in the main text.

Variable                                  | Group 1 (31%) | Group 2 (23%) | Group 3 (19%) | Group 4 (18%) | Group 5 (9%)
Black                                     | [illegible]   | -0.785        | -0.787        | -0.758        | 0.346
White                                     | -0.938        | 1.446         | -0.565        | 0.541         | [illegible]
Hispanic                                  | -0.772        | -0.497        | 1.81          | 0.192         | -0.162
Asian                                     | -0.535        | 0.21          | -0.301        | 0.945         | 0.081
Speak English Well                        | 0.84          | 0.526         | -1.621        | -0.574        | 0.221
Bilingual                                 | -1.066        | -0.323        | 1.22          | 0.836         | -0.153
Speak Only Another Language (not English) | -0.82         | -0.516        | 1.633         | 0.315         | -0.232
Foreign-Born                              | -1.017        | -0.285        | 1.314         | 0.847         | -0.122
Income and Employment                     | 0.973         | -1.257        | 0.054         | -0.293        | 0.294
Education and Occupation                  | -0.319        | 1.258         | -1.168        | 0.09          | 0.136